ftp.cs.arizona.edu

Internet Message Format | 1996-01-03 | 1KB

Received: by cheltenham.cs.arizona.edu; Wed, 22 Nov 1995 12:26:50 MST
Message-Id: <9511221312.AA17401@ns1.computek.net>
Mime-Version: 1.0
Content-Length: 932
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Date: Wed, 22 Nov 95 13:12 CST
From: gep2@computek.net
Subject: Parsing structured text
To: icon-group@cs.arizona.edu
X-Mailer: SPRY Mail Version: 04.00.06.17
Errors-To: icon-group-errors@cs.arizona.edu

>Are there any general techniques for parsing text (and structured text
>in particular) that may be spread over several lines, e.g. HTML or LaTeX
>files? Most examples in the Icon book (1st edition) assume that the
>input can be dealt with a line at a time. What approach should I take
>when dealing with structured text, such as HTML or LaTeX, in which the
>entities to be parsed may be embedded inside other entities and may
>extend over several lines?

Personally, the approach I tend to use in most cases like this is to
create a super-line which consists of (say) three or four adjacent lines
concatenated together. This way, I have the whole construct I want to
parse in one string.

On the other hand, if you have constructs that are several pages
separated from each other, that sorta calls for alternative approaches,
outside the scope of a mailing list posting. :-)

Gordon Peterson
http://www.computek.net/public/gep2/
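
The super-line idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, written in Python rather than Icon: the function name find_constructs, the TAG pattern (a very simplified HTML-like tag), the default window of three lines, and joining lines with a single space are all hypothetical choices, not anything from the original posting.

```python
import re

# Hypothetical "construct" for illustration: a simplified HTML-like tag.
TAG = re.compile(r"<[^<>]+>")


def find_constructs(lines, width=3, pattern=TAG):
    """Return (line_number, text) for each construct, using super-lines.

    For every input line, a super-line is built from that line and the
    following `width - 1` lines. A match is reported only if it starts
    inside the first line of the window, so each construct is reported
    exactly once even though the windows overlap.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    found = []
    for i, first in enumerate(lines):
        # Concatenate adjacent lines into one string (the "super-line").
        super_line = " ".join(lines[i:i + width])
        for m in pattern.finditer(super_line):
            if m.start() >= len(first):
                break  # starts in a later line; a later window reports it
            found.append((i + 1, m.group()))
    return found


if __name__ == "__main__":
    sample = [
        '<p>See <a href="x"',
        'title="y">this</a> and',
        "<b>that</b></p>",
    ]
    for line_no, text in find_constructs(sample):
        print(line_no, text)
```

A construct that spans more than `width` lines is simply not matched by this sketch, which mirrors the caveat above about constructs that lie far apart needing a different approach.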